Query Performance Appraisal using SPARQL & Map Reduce Technique on Web Semantics
Abstract
The Semantic Web is an emerging technology that aims to make data across the globe semantically connected. The data is represented as a very simple statement-like construct with a subject, a predicate and an object. This can be visualized as a graph with the subject and the object as nodes and the predicate as an edge connecting the two nodes. When many such statements are collected together, they form an RDF graph. There are RDF query languages for querying such data, and SPARQL is one of them. According to the SP²Bench performance benchmarks, SPARQL queries are very slow on RDF data with millions of triples. Hence, we aim to develop a MapReduce-based model of parallelization for SPARQL query processing, and we hypothesize that this system will outperform the scalability and performance reported by SP²Bench. We extend ARQ, an open source SPARQL query engine provided by the Jena framework, to work with the Hadoop MapReduce framework and implement distributed SPARQL query processing. This thesis provides the implementation and algorithmic details of our work. We contribute two novel methods for optimizing an RDF query engine: one exploits document indexes and the other is a join preprocessing technique. The experimental results show the merits and demerits of using MapReduce for distributed RDF query processing and give us a clear path for future work.
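To make the triple model and the idea of a MapReduce-style join concrete, here is a minimal in-memory sketch in Python. It is not the thesis's ARQ/Hadoop implementation. The toy triples, the two-pattern query (?x knows ?y . ?y worksAt ?org) and the function names are all assumptions chosen for illustration. The map step keys each matching triple on the shared variable ?y, and the reduce step joins the two sides for each key.

```python
from collections import defaultdict

# Hypothetical toy data: each RDF statement is a (subject, predicate, object) triple.
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
    ("bob", "worksAt", "initech"),
]

def map_phase(triples):
    """Emit (join_key, tagged_value) pairs for the patterns
    ?x knows ?y  and  ?y worksAt ?org, keyed on the shared variable ?y."""
    for s, p, o in triples:
        if p == "knows":
            yield o, ("knows", s)      # ?y is the object of 'knows'
        elif p == "worksAt":
            yield s, ("worksAt", o)    # ?y is the subject of 'worksAt'

def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """For each binding of ?y, combine every ?x with every ?org."""
    results = []
    for y, values in groups.items():
        xs = [v for tag, v in values if tag == "knows"]
        orgs = [v for tag, v in values if tag == "worksAt"]
        for x in xs:
            for org in orgs:
                results.append((x, y, org))
    return sorted(results)

def run_query(triples):
    return reduce_phase(shuffle(map_phase(triples)))

if __name__ == "__main__":
    for row in run_query(TRIPLES):
        print(row)
```

In a real Hadoop job the shuffle is performed by the framework across machines. The sketch only shows why a join on a shared variable maps naturally onto the key-grouping step.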
Similar resources
SPARQL-ST: Extending SPARQL to Support Spatiotemporal Queries
Spatial and temporal data is plentiful on the Web, and Semantic Web technologies have the potential to make this data more accessible and more useful. Semantic Web researchers have consequently made progress towards better handling of spatial and temporal data. SPARQL, the W3C-recommended query language for RDF, does not adequately support complex spatial and temporal queries. In this work, we p...
Learning-Based SPARQL Query Performance Prediction
According to the predictive results of query performance, queries can be rewritten to reduce time cost or rescheduled to the time when the resource is not in contention. As more large RDF datasets appear on the Web recently, predicting performance of SPARQL query processing is one major challenge in managing a large RDF dataset efficiently. In this paper, we focus on representing SPARQL queries...
SPARQL with property paths on the Web
Linked Data on the Web represents an immense source of knowledge suitable to be automatically processed and queried. In this respect, there are different approaches for Linked Data querying that differ on the degree of centralization adopted. On one hand, the SPARQL query language, originally defined for querying single datasets, has been enhanced with features to query federations of datasets;...
Viewing and Querying Topic Maps in terms of RDF
Both Topic Maps and RDF are popular semantic web standards designed for machine processing of web documents. Since these representations were originally created for different purposes, they have conceptual differences in their data models, and therefore have different tools to parse, store, and query them. However, there are more tools to handle RDF data than those existing for Topic Maps. Our ...
Efficient SPARQL Query Processing via Map-Reduce-Merge
The move towards a “semantic web” is driving the need for efficient querying ability over large datasets consisting of statements about web resources. RDF is a set of standards for describing and modeling data and is the backbone of the semantic web technologies. RDF datasets can be very large, and often are subject to complex queries with the intent of extracting and inferring otherwise unseen ...
Publication date: 2015